Abstract: This paper details a data structure for managing 
and scheduling requests for computing resources of clusters 
and virtualised infrastructure such as private clouds. The data 
structure uses a red-black tree whose nodes represent the start 
times and/or completion times of requests. The tree is enhanced 
by a double-linked list that facilitates the iteration of nodes 
once the start time of a request is determined by using the 
tree. We describe the main features of the data structure, provide an 
example of its use, and discuss experiments that demonstrate that 
the average complexity of two operations is often below 10% 
of their respective theoretical worst cases. 

I. Introduction 

Advances in IT have led to the emergence of virtualisation 
and provisioning models where resources are provided to client 
applications on demand. Under such models, often termed 
cloud computing [1], customers request resources to run their 
applications and pay only for what they consume. Techniques 
to manage the allocation of resources and schedule user 
requests are often at the core of cloud management systems. 

Although on-demand provisioning was initially the only 
model offered by clouds, other approaches such as reserved 
virtual machines¹ and spot instances² have later gained popularity. 
Resource reservations are of interest to users as they 
provide means for reliable allocation and enable users to plan 
the execution of their applications. Previous work in large-scale 
computing infrastructure demonstrates that certain deadline-constrained 
applications demand predictable quality of service 
[2], often requiring a number of computing resources to be 
available over a well-defined period, commencing at a specific 
time in the future; such requirements are well suited to advance reservation. 

For scheduling decisions, management systems generally 
maintain information on resource availability in data structures 
or databases 0 - The systems may need to handle numerous 
requests per minute, with each request arrival or completion 
triggering scheduling operations requiring multiple reads or 
updates to the data structure. Efficient data structures are 
essential for checking, in a timely manner, whether reservations or ordinary 
requests can be accommodated, or for providing alternatives to 
users with flexible requests 0 - Several of these operations are 
generally referred to as admission control. 

This paper describes a data structure for storing information 
on computing resource availability and performing admission 

1 http://aws.amazon.com/ec2/purchasing-options/reserved-instances/ 
2 http://aws.amazon.com/ec2/purchasing-options/spot-instances/ 


control of ordinary requests and reservations of computing 
resources. The data structure uses a red-black tree; a binary 
search tree with one additional attribute per node: its colour, 
which can be either red or black 0 - The structure is enhanced 
by a double linked list used to iterate nodes that contain the 
resource availability when checking whether a request can be 
accepted. The data structure, termed as “Availability Profile”, 
or “Profile” for short, maintains information on resources 
available when requests start or complete. 

The rest of this paper is organised as follows. Section II 
describes background and related work. In Section III we 
introduce the data structure, whereas Section IV illustrates how 
it can be used to build scheduling policies. Section V contains 
results on evaluating the practical average complexity of two 
operations, and Section VI concludes the paper. 

II. Background and Related Work 

A. Resource Reservations 

Although initially oriented towards on-demand provisioning, 
cloud computing solutions have later introduced other 
means to provide resources to client applications, including 
frameworks that enable advance and immediate resource 
reservations³. In the past, other systems have benefitted from 
reservations, including grids where large-scale experiments can 
demand co-allocation of resources across several sites ©■ 

Systems that manage request scheduling and resource allocation 
generally employ a data structure to store information 
on resources available until a particular time in the future. 
The structure is examined to check whether a request can 
be admitted or not. The period over which the availability 
information is stored depends on the resource allocation policy 
in use. For example, for an allocation policy that schedules 
requests using conservative backfilling [7] and reserves resources 
as requests arrive, this period may vary according 
to the number of requests currently in the system. Under 
aggressive backfilling [8], this period is often shorter as the 
scheduler maintains details on running requests and about the 
first waiting request. 

B. Data Structures Using Slotted Time and Non-Slotted Time 

In slotted-time data structures, the period over which the 
availability information is stored is divided into time frames (slots) of equal 
length. If accepted, a request is allocated a number of 
consecutive slots for a period long enough to accommodate it 
(i.e. a number of slots whose total duration is equal to or greater 
than the time frame initially requested). 

3 https://wiki.openstack.org/wiki/Blazar 

The data structure presented here does not use slotted time, 
thus allowing for finer time granularity for accepted requests 
as the duration of allocated time frames does not need to be 
a multiple of the slot length. As discussed next, the structure uses 
ranges of available resources as it needs to ensure that the same 
resources are allocated to a request during its whole execution. 
With slotted time and short slots, the profile would have this 
range information replicated in every slot, and iterating 
slots would be time-consuming. 
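
The granularity argument can be made concrete with a short sketch. The helper names `slots_needed` and `over_allocation` and the numeric values are illustrative assumptions, not part of the data structure described in this paper; the code only shows how rounding a request up to whole slots reserves more time than requested.

```python
def slots_needed(duration: int, slot_length: int) -> int:
    """Smallest number of consecutive slots whose total length covers duration."""
    return (duration + slot_length - 1) // slot_length  # integer ceiling


def over_allocation(duration: int, slot_length: int) -> int:
    """Time reserved beyond what was requested, in the same unit as duration."""
    return slots_needed(duration, slot_length) * slot_length - duration


# A 130-second request with 60-second slots occupies 3 slots (180 seconds),
# so 50 seconds are reserved but unused.
example = (slots_needed(130, 60), over_allocation(130, 60))
```

Shorter slots reduce the over-allocation but increase the number of slots that must be stored and iterated, which is the trade-off a non-slotted structure avoids.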

III. The Availability Profile 

The proposed data structure follows the concept of availability 
profile [7], which utilises a list whose entries contain 
information about the ranges of resources available after the start 
and completion of requests. This paper enhances the concept 
of availability profile by: 

• allowing it to maintain information about reservations; 

• using a red-black tree to search for the start of a 
free time frame suitable for scheduling a request, thus 
reducing the complexity from O(n) when using a 
sorted list to O(log n) by using the tree; and 

• storing information about the ranges of resources 
available at each node, hence enabling various policies 
to select time frames, such as first-fit, best-fit and 
worst-fit. 
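
The logarithmic search for the start of a time frame can be illustrated as follows. This is a minimal sketch that substitutes a sorted Python list and the standard `bisect` module for the red-black tree: the lookup is O(log n) in both cases, but inserting into a list is O(n), which is one reason a balanced tree is used instead. The function name `find_anchor_index` and the node times are hypothetical.

```python
from bisect import bisect_right


def find_anchor_index(times: list, start_time: int) -> int:
    """Index of the last entry whose time is <= start_time, or -1 if none."""
    return bisect_right(times, start_time) - 1


# Illustrative, sorted node times (start and completion times of requests).
times = [0, 100, 250, 400, 700]

# A request starting at 220 is anchored at the node with time 100 (index 1).
anchor_index = find_anchor_index(times, 220)
```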

A red-black tree is approximately balanced because of the 
way nodes are coloured on any path from the root to a leaf, which 
ensures that no such path is more than twice as long as any other. 
After modifying a red-black tree, rotation and colour change 
operations guarantee that it remains approximately balanced. 
The nodes of the tree contain information about resources 
available at specific times in the future, namely the start and/or completion 
times of requests. The profile utilises ranges of resources 
as it needs to know whether the selected resources would 
be available over the entire period requested. For instance, 
a cluster with 10 resources has a range from 0 to 9 (i.e. 
[0..9]). This contrasts with data structures that store bandwidth 
available on a network link, as in the latter the availability at 
a given time is generally a single number [9]. The profile needs to 
ensure that the same resources are available over the period 
requested, as starting a request on a set of resources and 
migrating it several times during execution is undesirable. Here 
a resource represents a slot (i.e. a combination of number of 
vCPUs, memory and storage) to run a virtual machine, but the 
structure is generic enough to allow a scheduler to work with 
other types of abstractions. 
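
A possible representation of resource ranges is sketched below. It assumes that a node stores a sorted list of disjoint, inclusive `(first, last)` pairs; the names `intersect` and `count` are hypothetical and the actual implementation may differ.

```python
def intersect(a, b):
    """Intersection of two sorted lists of disjoint inclusive (first, last) ranges."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def count(ranges):
    """Number of resources covered by a list of inclusive ranges."""
    return sum(hi - lo + 1 for lo, hi in ranges)


# A cluster with 10 resources is [0..9]; intersecting it with the ranges
# free at a later time leaves only resources free at both times.
common = intersect([(0, 9)], [(2, 4), (7, 12)])   # [(2, 4), (7, 9)]
```

Keeping ranges rather than a single counter is what lets the profile verify that the same resources remain free over the whole requested period.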
The red-black tree is used to locate the node that represents 
the start of a request, termed the anchor, whereas the double-linked 
list is employed to iterate nodes once the anchor 
is found. By using the list, all nodes until the supposed 
request completion time are examined to check whether there are 
resources available to admit the request into the system. We 
provide an example of a cluster of 13 resources to depict how 
the data structure works (see Figure 1). Figure 1 (a) shows 
the scheduling queue at time 0; the queue contains both 
best-effort requests and reservations. Although a best-effort 
request starts as soon as enough resources are available, a 
reservation requires resources over a well-defined time frame. 
The operations for obtaining a time frame to accommodate 
a request are detailed later. As requests are inserted, the 
profile is updated to reflect the new resource availability. One 
node is inserted containing the time at which the request is 
expected to complete, the number of resources available after 
its completion, and the ranges of resources available once it 
completes. 

Fig. 1. Pictorial view of the data structure: (a) scheduling queue of a cluster 
with 13 resources; (b) availability information as a red-black tree; and (c) 
information stored by the nodes. 

Figure 1 (b) illustrates the resulting red-black tree, where 
shaded circles are black nodes. Each node represents a time 
and contains the information presented in Figure 1 (c), where 
dashed lines are the linked list connecting sibling nodes. For 
the profile presented in Figure 2 as an example, to accept a 
reservation request whose start time is 220, whose finish time is 700, 
and which requires 2 resources, the algorithm: 

1) Obtains from the reservation the start time, finish 
time, and number of resources required. 

2) Uses the reservation start time to find the node (i.e. 
the anchor) whose time precedes or is equal to the 
reservation’s start time. If the anchor does not have 
enough resources, then the request is rejected. 

3) Examines the ranges if the anchor has enough resources 
to serve the reservation. Then, uses the list 
to iterate the tree and examine all nodes whose times 
are smaller than the reservation’s finish time. For 
each node, the algorithm computes the intersection 
of the node’s ranges with the intersection of ranges 
from previously examined nodes. If the resulting 
intersection has enough resources to serve the request, 
then it is accepted. 

4) Stops and rejects the request whenever a computed 
range intersection does not have enough resources. 

Fig. 2. Iterating a tree using the linked list to perform admission control of 
a reservation request to start at 220 and finish at 700. 

Figure 3 (a) illustrates the relevant part of the profile 
represented as lists of resource ranges available over time, 
whereas Figure 3 (b) depicts the actual scheduling queue 
with the corresponding reservations. The new request can be 
accepted in this case because the intersection of ranges has 
more resources than what is required. 
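
The admission steps for a reservation can be sketched in a few lines. The sketch makes several simplifying assumptions that are not taken from the paper: a sorted Python list stands in for both the red-black tree and the linked list, sets of resource identifiers stand in for range lists, and the names `Entry` and `check` as well as the profile values are hypothetical (they are not the values of Figures 2 and 3).

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Entry:
    time: int
    free: frozenset  # resources free from `time` until the next entry


def check(entries, start, finish, n_res):
    """Return the resources free over [start, finish), or None if rejected."""
    times = [e.time for e in entries]        # sketch only: a tree avoids this copy
    idx = bisect_right(times, start) - 1     # anchor: last entry at or before start
    if idx < 0:
        return None
    common = entries[idx].free
    if len(common) < n_res:
        return None                          # anchor lacks resources: reject
    for k in range(idx + 1, len(entries)):
        if entries[k].time >= finish:
            break                            # nodes at or after finish are ignored
        common = common & entries[k].free    # running intersection
        if len(common) < n_res:
            return None                      # reject as soon as it is too small
    return common


# Illustrative profile of a 13-resource cluster.
profile = [
    Entry(0, frozenset(range(8, 13))),
    Entry(200, frozenset(range(4, 13))),
    Entry(500, frozenset(range(2, 13))),
    Entry(800, frozenset(range(13))),
]
granted = check(profile, 220, 700, 2)   # resources 4..12 are free throughout
```

The loop visits only the entries between the anchor and the finish time, which corresponds to the sub-list of m nodes discussed in the complexity analysis.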

A. Operations 

The implementation of the availability profile contains 
several operations to: 

• Check whether a reservation with strict start and finish 
times can be accommodated. 

• Find a time frame over which a request with flexible 
start and finish times can execute. 

• Obtain the availability information (i.e. free time 
frames) in the profile. 

• Get the scheduling options for a request or reservation, 
which are important for schedulers based on strategies 
such as best-fit and worst-fit. 

• Add time frames to a profile when requests are cancelled 
or paused, or new resources are added to a pool. 

• Reconstruct availability profiles from time frames. 

• Allocate time frames to requests. 

Next, we describe two operations: checking resource 
availability (i.e. deciding whether a request can be served) and 
updating the profile by allocating the resource ranges assigned 
to the request. 

1) Check Availability: As discussed earlier, we consider 
two types of requests: reservations, which require resources over a 
well-defined time frame, and best-effort requests, which accept 
resources as they become available. 

As mentioned earlier, the process of checking whether a 
reservation request can be accommodated comprises first 
finding the anchor by using the tree, and then iterating the list 
to check all nodes lying between the anchor and the last node 
before the requested finish time. The worst-case cost of 
checking whether a reservation can be admitted into the system 
is O(log n + m), or O(m) when m dominates, where log n is the cost of finding 
the anchor node in the tree and m is the number of nodes of 
the sub-list between the anchor and the last node before the 
request finish time. 

Fig. 3. Part of a scheduling queue as (a) ranges of available resources and 
(b) reservations. To accommodate a reservation request, the intersection of 
available ranges must have enough resources. The example request has 
start time 220, finish time 700, and requires 2 resources. 

To schedule a best-effort request, which can be served at 
any time, the algorithm can start iterating the tree using the 
current time as start time. Unlike the admission of 
a reservation, however, to find a time frame for a best-effort 
request the algorithm starts with a potential anchor, that is, a node with 
enough resources to serve the request. The intersection of the 
potential anchor’s resource ranges with the following nodes’ 
ranges until before the expected completion of the request 
needs to have enough resources to accommodate the request; 
otherwise, the search resumes from the next potential anchor. 
The worst-case cost of this operation is O(log n + m²), 
or O(m²) when m dominates, where log n is the cost of finding the first anchor 
in the tree and m is again the number of nodes of the sub-list 
between the first potential anchor and the end of the list. The 
pseudo-code for this procedure is depicted in Algorithm 1. 
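
The search for a best-effort time frame can also be sketched in Python. As assumptions of this sketch rather than details of the actual implementation, a sorted list replaces the tree and linked list, sets of resource identifiers replace range lists, and the names `Entry` and `find_start` and the profile values are hypothetical. Nodes whose time equals the potential finish time are skipped, following the description that only nodes before the expected completion are examined.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Entry:
    time: int
    free: frozenset  # resources free from `time` until the next entry


def find_start(entries, now, duration, n_res):
    """Return (start_time, free_resources) of the earliest feasible frame, or None."""
    first = max(bisect_right([e.time for e in entries], now) - 1, 0)
    for i in range(first, len(entries)):
        anchor = entries[i]
        if len(anchor.free) < n_res:
            continue                          # not a potential anchor
        start = max(anchor.time, now)         # potential start time
        finish = start + duration             # potential finish time
        common = anchor.free
        for k in range(i + 1, len(entries)):
            if entries[k].time >= finish:
                break
            common = common & entries[k].free
            if len(common) < n_res:
                break                         # this anchor fails; try the next one
        if len(common) >= n_res:
            return start, common
    return None


profile = [
    Entry(0, frozenset({0, 1})),
    Entry(100, frozenset(range(5))),
    Entry(300, frozenset({3, 4})),
    Entry(600, frozenset(range(8))),
]
short_job = find_start(profile, 0, 150, 4)   # fits at time 100
long_job = find_start(profile, 0, 250, 4)    # must wait until time 600
```

The nested loops make the quadratic worst case visible: each of up to m potential anchors may trigger a scan of up to m following entries.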

Algorithm 1: Find a time frame to accommodate a request. 

input : the request’s duration and number of resources (reqRes) 
output: a profile entry with the request’s start time and available ranges 

ctime ← the current time 
iter ← profile iterator starting at the node preceding ctime 
intersec ← null 
pstime ← ctime    // request’s potential start time 
pftime ← −1       // request’s potential finish time 
anchor ← null 
while iter has a next element do 
    anchor ← the next element of iter 
    if anchor.nRes < reqRes then 
        continue 
    else 
        // a potential anchor has been found 
        pstime ← anchor.time          // potential start is anchor’s time 
        pftime ← pstime + duration    // potential finish time 
        intersec ← anchor.ranges      // intersection of ranges 
        ita ← profile iterator starting after pstime 
        while ita has a next element do 
            nxnode ← the next element of ita 
            // does not check nodes beyond potential finish time 
            if nxnode.time > pftime then 
                break 
            else 
                if nxnode.nRes < reqRes then 
                    // not enough resources available 
                    intersec ← null 
                    break 
                intersec ← intersec ∩ nxnode.ranges 
                if intersec.nRes < reqRes then 
                    // not enough resources available 
                    break 
        if intersec ≠ null and intersec.nRes ≥ reqRes then 
            // found time frame with enough resources 
            break 

The profile also provides operations for the scheduler 
to obtain the free time frames. A free time frame contains 
information about the resources available over a given time 
interval. A time frame has a start time, a finish time, and the 
ranges of resources available. The availability profile has two 
operations to obtain the free time frames. The first operation 
returns free time frames that do not overlap with one another, 
similar to the approach used by Singh et al. [2] in their 
extended conservative backfilling policy. The complexity of 
this operation is O(log n + m²), or O(m²), in the worst-case 
scenario, where log n is the cost of finding the anchor 
for the query’s start time and m is the number of nodes in 
the list between the anchor node and the last node before the 
query’s end time. In real scenarios, however, this operation is 
not invoked often, as the operations required to check whether 
a request can be admitted generally do not need to obtain a 
list of free time frames. 

The second operation to query free time frames returns 
a list of scheduling options [3], where time frames overlap. 
The returned time frames are termed scheduling options 
because they represent positions in the queue where a given 
request or reservation can be placed. This operation is useful 
when a scheduler needs to perform more complex selection of 
resource ranges for a best-effort or reservation request. For 
example, in some systems the users are allowed to extend 
previous reservations or resource leases ED- The scheduler 
may be required to select a free time frame for a reservation 
that can accommodate a potential extension or renewal of the 
resulting resource lease. 

Fig. 4. Example of a profile with two resource partitions. 

2) Updating the Profile: Once a time frame for a request 
is found and the request is accepted, the availability profile needs to be 
updated accordingly, which consists of: 

• Updating the anchor or inserting a new node if the request’s 
start time does not coincide with the anchor’s. 

• Updating all entries from the anchor until before 
the request’s completion time, removing the selected 
resource ranges. 

• Inserting a new node marking the completion time of 
the request containing the ranges available once the 
request completes. 

To minimise the number of nodes in the tree, requests with 
the same start time or completion time share nodes. The worst-case 
complexity of the update operation is O(log n + m), or 
O(m), as it consists of inserting one element in the tree (i.e. 
log n) and updating the m nodes until before the completion 
of the request by iterating the linked list. 
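
The update operation of Section III-A2 (update or insert the anchor, remove the allocated resources from every entry before the completion time, and insert a completion node) can be sketched as follows. This is only an illustration under stated assumptions: a sorted Python list replaces the tree and linked list, sets of resource identifiers replace range lists, and the names `Entry` and `allocate` are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Entry:
    time: int
    free: frozenset  # resources free from `time` until the next entry


def allocate(entries, start, finish, selected):
    """Remove `selected` resources from the profile over [start, finish)."""
    i = bisect_right([e.time for e in entries], start) - 1   # anchor
    if i < 0:
        raise ValueError("profile has no entry at or before the start time")
    if entries[i].time != start:
        # The start does not coincide with the anchor: insert a new node.
        entries.insert(i + 1, Entry(start, entries[i].free))
        i += 1
    j = bisect_right([e.time for e in entries], finish) - 1
    if entries[j].time != finish:
        # Completion node: availability is restored once the request completes.
        entries.insert(j + 1, Entry(finish, entries[j].free))
        j += 1
    for k in range(i, j):
        # Entries in [start, finish) lose the selected resources.
        entries[k].free = entries[k].free - selected


profile = [Entry(0, frozenset(range(13)))]
allocate(profile, 220, 700, frozenset({0, 1}))   # adds nodes at 220 and 700
allocate(profile, 0, 220, frozenset({5}))        # reuses the nodes at 0 and 220
```

The second call inserts no node because the request starts and completes at times that already have entries, which mirrors the node sharing described for requests with equal start or completion times.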


B. Multiple Resource Partitions 

An availability profile that controls the allocation of resource 
ranges to different resource partitions or pools 03 is 
also provided. This data structure, termed partitioned profile, 
is depicted in Figure 4, where nodes store the ranges available 
at more than one resource partition. A user can check the 
availability of a given partition as well as update that particular 
partition. As this profile is just an extension of the previously 
described structure, it is possible to create allocation policies 
based on the partitioned profile that allow a partition to borrow 
resources from another. To enable borrowing, a user uses the 
operations offered by the normal profile. 
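
One way to picture a node of a partitioned profile is sketched below. The layout is an assumption made for illustration: each node maps a partition identifier to the set of resources free in that partition, and borrowing is modelled simply as the union of the free sets of the borrowing and lending partitions. The names `PartEntry`, `free_in` and `free_with_borrowing` are hypothetical and the paper does not prescribe this borrowing rule.

```python
from dataclasses import dataclass


@dataclass
class PartEntry:
    time: int
    free: dict  # partition id -> frozenset of free resource ids


def free_in(entry, partition):
    """Resources free in one partition at this node."""
    return entry.free.get(partition, frozenset())


def free_with_borrowing(entry, partition, lenders):
    """Resources usable by `partition` if it may borrow from `lenders`."""
    usable = free_in(entry, partition)
    for other in lenders:
        usable = usable | free_in(entry, other)
    return usable


node = PartEntry(100, {"A": frozenset({0, 1}), "B": frozenset({8, 9, 10})})
own = free_in(node, "A")                           # 2 resources
extended = free_with_borrowing(node, "A", ["B"])   # 5 resources
```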


C. Implementation Details 

The data structure has been implemented both in Java 
and Python. An early version has been included in GridSim 
d); a grid simulation toolkit that enables modelling and 
simulation of clusters of computers, grids, storage devices 
and network topologies. The structure has also been used by 
resource allocation policies in previous work [14], 03 and in 
schedulers of system prototypes 03- 








































IV. Using the Profile 

We show here how to use the profile to build a conservative 
backfilling scheduler [7], but it should be straightforward 
to implement other policies. The example also demonstrates 
how to obtain the scheduling options for a request so that a 
scheduler can select resources using approaches such as best-fit 
and worst-fit to minimise a queue’s fragmentation and improve 
resource utilisation (D-0. CCD- COD- 

Algorithm 2: Sample conservative backfilling scheduler. 

procedure reqSubmitted(Req r) 
    success ← startReq(r) 
    if success = false then 
        success ← enqueueReq(r) 

procedure startReq(Req r) 
    ctime ← gets current time 
    anchor ← profile.check(r.nRes, ctime, r.duration) 
    if anchor without enough resources then 
        return false 
    else 
        sls ← select ranges from anchor 
        profile.allocate(sls, ctime, r.duration) 
        r.ranges ← sls 
        r.starttime ← ctime 
        return true 

procedure enqueueReq(Req r) 
    // search for an anchor 
    anchor ← profile.check(r.nRes, r.duration) 
    sls ← select ranges from anchor 
    profile.allocate(sls, anchor.time, r.duration) 
    r.ranges ← sls 
    r.starttime ← anchor.time 

procedure resSubmitted(Reserv r) 
    success ← admitReserv(r) 
    if success = false then 
        options ← profile.getOptions(r.starttime, r.nRes, r.duration) 
        // send scheduling options to user 

procedure admitReserv(Reserv r) 
    anchor ← profile.check(r.nRes, r.starttime, r.duration) 
    if anchor without enough resources then 
        return false 
    else 
        sls ← select ranges from anchor 
        // allocate from the reservation’s requested start time 
        profile.allocate(sls, r.starttime, r.duration) 
        r.ranges ← sls 
        return true 

Algorithm 2 shows the scheduler’s main operations, where 
a request is scheduled as it arrives [7]. Operation reqSubmitted(Req r), 
called when a request r arrives, tries to start r immediately 
by calling startReq(Req r). Procedure startReq(Req 
r) executes the same steps required to admit a reservation (i.e. 
admitReserv(Reserv r)), as all best-effort requests are initially 
treated as immediate reservations when they arrive. If a request 
cannot start immediately, the scheduler finds an anchor with 
the time at which the request can start, a procedure depicted by 
enqueueReq(Req r). Once the anchor is found, the scheduler 
selects the resource ranges and updates the profile accordingly. 

V. Experimental Setup and Results 

We evaluated the practical average complexity of two 
operations, namely (i) assessing the resource availability to 
find a time frame where a request can be placed (i.e. the schedule 
operation), whose theoretical worst case is O(m²), where m is 
the number of elements in the sub-list after a first potential start 
time is found using the red-black tree; and (ii) checking whether a 
given advance reservation can be granted (i.e. the check operation), 
whose complexity is O(m), where m is the number of nodes 
after the start time, as discussed in Section III. 

We used a discrete event simulator developed in-house 
to model and simulate various scheduling policies using 
the data structure described here. For the experiments reported here, we 
used conservative backfilling and considered two scenarios, 
one with a cluster of 1152 CPU cores and another with 
446 CPU cores; the latter, in addition to normal requests, 
permits advance reservations. Although the first scenario does 
not allow reservations, the scheduler initially treats a request that 
arrives as a reservation starting immediately, and 
hence uses the check operation to assess the current resource 
availability. If the scheduler does not find the required resources 
available, it uses the schedule operation to find a suitable 
time frame for the request. To drive the workload for the first 
cluster we obtained one year of request traces (from Jan. to 
Dec. 2002) from the SDSC Blue Horizon machine⁴. For the 
second scenario, we used one year of request logs (from Jan. 
to Dec. 2013) from four clusters at the Lyon site of Grid’5000 
G3- We ignored the results of the first and last 4,000 calls 
to each operation to minimise the impact of warm-up and 
cool-down phases. In both scenarios, more than 60,000 calls to 
each operation are taken into account. 

Figure 5 summarises the average complexity for check 
and schedule operations by showing the percentage of nodes 
visited while iterating the list, relative to what would be the 
worst case for each access. Although the average complexity 
is in general low when compared to the theoretical worst 
case, it is higher for Blue Horizon due to larger 
fragmentation of its queues. Requests made at the Grid’5000 
clusters generally span several hours. 
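
The reported metric can be illustrated with a small sketch. It assumes that each call to an operation is logged as a pair `(visited, worst_case)`, where `worst_case` is m for the check operation and m² for the schedule operation; this logging format, the function name `trimmed_mean_pct`, and the sample values are hypothetical and do not come from the simulator used in the experiments.

```python
from statistics import mean


def trimmed_mean_pct(samples, skip):
    """Mean percentage of the worst case, ignoring the first and last `skip` calls.

    samples: list of (visited, worst_case) pairs in call order.
    """
    kept = samples[skip:len(samples) - skip]
    return mean(100.0 * visited / worst for visited, worst in kept if worst > 0)


# Four illustrative calls; the first and last are dropped as warm-up and cool-down.
calls = [(5, 5), (1, 10), (2, 40), (9, 9)]
average = trimmed_mean_pct(calls, 1)   # mean of 10.0 and 5.0
```

In the experiments described above, the equivalent of `skip` is 4,000 calls per operation.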
Fig. 5. Average complexity of check and schedule operations. 


4 Parallel Workloads Archive: http://www.cs.huji.ac.il/labs/parallel/workload/ 























Though outliers are not plotted in Figure 5 to improve 
readability, we noticed that in certain cases the worst case is 
reached, particularly for the check operation in Blue Horizon. 
However, after a more detailed inspection, we observed that 
the worst case is approached when the number of entries to 
be evaluated is small. For Blue Horizon, Figure 6 shows a 
histogram of the number of entries that are not visited while 
the check operation iterates the list. 



Fig. 6. Non-visited entries for the check operation on Blue Horizon. 

The histogram shows that the number of non-visited entries 
is often small, thus demonstrating that even though the average 
complexity of the check operation is higher for Blue Horizon, 
the evaluated entry set at each iteration is generally small. 

VI. Conclusion 

This paper presented a data structure to facilitate the 
scheduling of requests by cloud resource management systems. 
We provided details about the data structure, which uses a red- 
black tree to find a potential start time for reservations and a 
double-linked list to iterate the tree’s nodes. We provided an 
example that demonstrates how the availability profile can be 
utilised to create scheduling policies and generate alternative 
offers for advance reservation requests. Experimental results 
show that the average practical complexity of two operations 
is often below 10% of their respective theoretical worst cases. 